We introduce a generalized bootstrap technique for estimators
obtained by solving estimating equations. Some special cases of this
generalized bootstrap are the classical bootstrap of Efron, the
delete-d jackknife and variations of the Bayesian bootstrap. The use of
the proposed technique is discussed in some examples. Distributional
consistency of the method is established and an asymptotic
representation of the resampling variance estimator is obtained.



1. Introduction. One of the most popular ways of obtaining estimators
for parameters in statistics is by solving "estimating equations." Examples
are abundant in the contexts of quasi-likelihood methods, time series,
biostatistics, stochastic processes, spatial statistics, robust inference,
survey sampling and other areas. Godambe (1991) and Basawa, Godambe and
Taylor (1997) contain extensive discussions on estimating equations. In this
paper we introduce a generalized bootstrap technique for estimators obtained
by solving estimating equations.

We use the following framework: Suppose $\{\phi_{ni}(Z_{ni},\beta), 1 \le i \le n, n \ge 1\}$
is a triangular sequence of functions taking values in $\mathbb{R}^p$, $\{Z_{ni}\}$ being
a sequence of observable random variables and $\beta \in \mathcal{B} \subset \mathbb{R}^p$. Assume that
$E\phi_{ni}(Z_{ni},\beta_0) = 0$, $1 \le i \le n$, $n \ge 1$ for some unique $\beta_0 \in \mathcal{B}$. The "parameter"
$\beta_0$ is unknown, and its estimator $\hat\beta_n$ is obtained by solving (often uniquely)
the estimating equations

(1.1) $\sum_{i=1}^{n} \phi_{ni}(Z_{ni},\beta) = 0.$

Typically, $\{\phi_{ni}(Z_{ni},\beta_0)\}$ form a triangular array of martingale differences.
The major objective of this paper is to estimate the sampling distribution
and the asymptotic variance of $\hat\beta_n$ by a new approach to resampling. We
define our resampling estimator $\hat\beta_B$ as the solution of

(1.2) $\sum_{i=1}^{n} w_{ni}\,\phi_{ni}(Z_{ni},\beta) = 0,$

where $\{w_{ni}, 1 \le i \le n, n \ge 1\}$ is a triangular sequence of random variables,
independent of $\{Z_{ni}\}$. These are the "bootstrap weights." Note that
essentially the same algorithm computes $\hat\beta_n$ and the Monte Carlo samples of $\hat\beta_B$.
This makes the proposed bootstrap software friendly.
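
The remark that one algorithm serves both the estimator and its resamples can be illustrated by a minimal sketch. It assumes NumPy, a scalar parameter, and an estimating function that is monotone decreasing in the parameter; the Huber score, the simulated data, the Exponential(1) weights and the names `solve_weighted_equation` and `huber_phi` are illustrative choices, not taken from the paper.

```python
import numpy as np

def solve_weighted_equation(phi, z, w, lo, hi, tol=1e-10):
    """Bisection root of g(b) = sum_i w_i * phi(z_i, b).

    Assumes g is nonincreasing in b with g(lo) >= 0 >= g(hi).
    """
    def g(b):
        return np.sum(w * phi(z, b))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid          # root lies to the right of mid
        else:
            hi = mid          # root lies at or to the left of mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def huber_phi(z, b, c=1.345):
    """Huber score for a location parameter (illustrative choice of phi)."""
    return np.clip(z - b, -c, c)

rng = np.random.default_rng(0)
z = rng.standard_t(df=3, size=100)      # hypothetical heavy-tailed sample

# Unit weights give the estimator itself, as in (1.1).
beta_n = solve_weighted_equation(huber_phi, z, np.ones_like(z), z.min(), z.max())

# Random weights give the resamples, as in (1.2); only the weights change.
beta_b = np.array([
    solve_weighted_equation(huber_phi, z, rng.exponential(size=z.size),
                            z.min(), z.max())
    for _ in range(300)
])
```

The same solver is called with unit weights and with random weights, which is the sense in which no separate resampling code is needed.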

The normal equations $\sum x_{ni}(y_{ni} - x_{ni}^T\beta) = 0$ for the least squares
estimator (LSE) in linear regression are a special case of (1.1). With
$(w_{n1},\ldots,w_{nn}) \sim$ Multinomial$(n, 1/n, \ldots, 1/n)$ we get the paired bootstrap (PB)
estimator from (1.2). Other choices of the $w_{ni}$'s yield the delete-d jackknives,
the Bayesian bootstrap, the m-out-of-n bootstrap and variations of these. Hence
we refer to resampling by (1.2) as the generalized bootstrap (GBS). Origins of
the concept of resampling equations may be traced back to Freedman and
Peters (1984) and Rao and Zhao (1992), where the bootstrap was carried out
using equations, as distinguished from resampling observations or residuals.
Note that the GBS technique is different from the bootstraps suggested by
Lele (1991) and Hu and Kalbfleisch (2000) for estimating equations.
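
For the least squares case, a minimal sketch of (1.2) with multinomial weights is given here. It assumes NumPy and synthetic heteroscedastic data; the function name `gbs_lse` and the simulated design are hypothetical and serve only to show that each resample is a weighted solve of the normal equations.

```python
import numpy as np

def gbs_lse(X, y, n_boot=500, seed=0):
    """Solve sum_i w_i x_i (y_i - x_i' b) = 0 for n_boot draws of
    Multinomial(n; 1/n, ..., 1/n) weights (the paired bootstrap)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    draws = np.empty((n_boot, p))
    for b in range(n_boot):
        w = rng.multinomial(n, np.full(n, 1.0 / n))   # weights sum to n
        Xw = X * w[:, None]                           # row i scaled by w_i
        # lstsq guards against a singular weighted Gram matrix
        draws[b] = np.linalg.lstsq(Xw.T @ X, Xw.T @ y, rcond=None)[0]
    return draws

# Hypothetical data, for illustration only.
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 + x, size=n)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
draws = gbs_lse(X, y)
gbs_se = draws.std(axis=0, ddof=1)    # resampling standard errors
```

Replacing the multinomial draw by another weight distribution with mean one gives a different GBS scheme without touching the solver.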

In Section 2.1 we state the conditions on GBS weights. In Section 2.2 we
briefly discuss some examples of GBS schemes. Since every choice of
distributions of the bootstrap weights corresponds to a different GBS technique,
it is of interest to compare their relative performances. A theoretical
comparison of different GBS techniques is under study, and some preliminary
results may be found in Bose and Chatterjee (2002). Section 2.3 contains
examples to illustrate the implementation of GBS. The standard GBS schemes,
obtained by taking i.i.d. or multinomial weights, appear to perform
competitively in a variety of problems, although there is some model and sample
size dependent performance variation.

In Section 3.1 we assume p = 1 and establish asymptotic linearizations of
$\hat\beta_n$ and $\hat\beta_B$. The distributional consistency of the GBS follows easily from
these. In Section 3.3 we consider models with increasing dimension by letting
$p \to \infty$ as $n \to \infty$ and establish similar results.

For the distribution of linear regression M-estimators, our results in this
paper imply that the GBS is consistent even when regressors are random,
errors are heteroscedastic or parameter dimension is increasing with sample
size. This may be compared with Lahiri (1992), where a nonnaive residual
bootstrap (RB) was found to be second-order accurate when covariates are
nonrandom, errors are i.i.d. and the parameter dimension is fixed. While
first-order consistency of GBS is achieved under relaxed assumptions, the
GBS is second-order accurate only after a complicated bias correction and
Studentization.

In Section 3.2 for dimension p = 1, we obtain an asymptotic representation 
for the GBS variance estimator, similar to the work of Liu and Singh (1992) 
and Hu (2001). Our result implies that for the asymptotic variance of linear 
regression M-estimators the GBS is consistent even when the errors are 
heteroscedastic, and yet can have greater asymptotic efficiency than some 
resampling schemes that are consistent only under homoscedasticity. 

The technical framework used here is for estimating equations similar to 
M-estimation problems. However, the underlying principle of GBS may be 
applicable to a much wider class of statistical problems. 

2. GBS weights: conditions and examples. In this section we spell out
the technical conditions needed on the GBS weights and give examples of
classes of weights which satisfy these conditions. We also illustrate the
implementation of GBS through a few examples.

2.1. Conditions on bootstrap weights. Let $\{w_{ni}; 1 \le i \le n, n \ge 1\}$ be a
triangular array of nonnegative random variables such that for each n, the
weights $w_{n1},\ldots,w_{nn}$ are exchangeable. These are to be used as weights and
we drop the suffix n from the notation. $P_B$ and $E_B$, respectively, denote
bootstrap probability and expectation conditional on the data. Let

$V(w_i) = \sigma_n^2, \qquad W_i = (w_i - 1)/\sigma_n,$

$c_{ijk\cdots} = E(W_a^i W_b^j W_c^k \cdots)$ and

$a_n = \sup_{\|c\|=1} \Bigl[\sum_{i=1}^{n} E\{c^T\phi_{ni}\}^2\Bigr]^{1/2}.$

In the conditions below, p is the dimension of the parameter space, which 
is allowed to tend to infinity with data size n in Section 3.3. 

The first set of conditions is fairly universal and is satisfied by all known 
examples of bootstrap weights. 

BW (Basic conditions):

(2.1) $E w_i = 1,$

(2.2) $0 < \sigma_n^2 = o(\min(a_n^2 p^{-1}, n)),$

(2.3) $c_{11} = O(n^{-1}).$

Schemes like the classical bootstrap and the delete-d jackknife satisfy
$\sum_{i=1}^{n} w_{ni} = C_n$ for some nonrandom sequence $\{C_n\}$. This implies that
$c_{11} = -1/(n-1)$ and thus (2.3) is satisfied.
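
The step from a nonrandom weight total to $c_{11} = -1/(n-1)$ can be checked directly: since $E w_i = 1$ forces $C_n = n$, the standardized weights sum to zero, and multiplying $\sum_j W_j = 0$ by $W_1$ and taking expectations gives $1 + (n-1)c_{11} = 0$. The short sketch below confirms this with exact arithmetic for multinomial weights; the function name `multinomial_c11` is illustrative, and the variance and covariance used are the standard multinomial moments.

```python
from fractions import Fraction

def multinomial_c11(n):
    """Exact c_11 = E(W_a W_b), a != b, for Multinomial(n; 1/n, ..., 1/n) weights."""
    p = Fraction(1, n)
    var_w = n * p * (1 - p)     # V(w_i) = sigma_n^2 = 1 - 1/n
    cov_w = -n * p * p          # Cov(w_a, w_b) = -1/n
    return cov_w / var_w        # covariance of the standardized weights

values = {n: multinomial_c11(n) for n in (5, 50, 500)}
```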

Additional assumptions required for distributional consistency are: 
CLTW (Conditions for GBS CLT):

(2.4) $c_{22} \to 1, \qquad c_4 < \infty.$

For variance estimation, we need the basic conditions, (2.5), (2.6) and
either part (a) or part (b) of (2.7) stated below.

Let $C^+ \subset (0, \infty)$ be a compact set, and let $\mathcal{W}$ be the set on which at least
$m_0$ of the weights are greater than some fixed constant $k_1 > 0$.

VW (Conditions for GBS variance):

(2.5) $P_B[\mathcal{W}] = 1 - O_P(n^{-1}),$

(2.6) $c_{i_1 i_2 \cdots i_k} = O(n^{-k+1}\sigma_n^{-1})$ for all $i_1, i_2, \ldots, i_k$ satisfying $\sum_{j=1}^{k} i_j = 3,$

(2.7) (a) $\sigma_n^2 \in C^+$; $c_{i_1 \cdots i_k} = O(\min(n^{-k+2}, 1))$ for all $i_1, \ldots, i_k$ with $\sum_{j=1}^{k} i_j = 4,$

      (b) $\sigma_n^2 \to 0$; $c_{i_1 \cdots i_k} = O(n^{-k+2})$ for all $i_1, \ldots, i_k$ with $\sum_{j=1}^{k} i_j = 4.$

In (2.6) and (2.7) the $i_j$'s are positive integers. In the following we refer to
conditions (2.5), (2.6) and (2.7)(a) as VW(a) and to conditions (2.5), (2.6)
and (2.7)(b) as VW(b).

2.2. Examples of GBS weights. We now list some common resampling
techniques that are special cases of GBS.

(a) Suppose $w_n = (w_{n1}, \ldots, w_{nn}) \sim$ Multinomial$(n; 1/n, \ldots, 1/n)$. These
weights can be interpreted as simple random sampling with replacement
of the functionals to minimize and essentially correspond to the classical
bootstrap of Efron (1979). Apart from BW, these weights also satisfy CLTW
and VW(a).

Suppose instead that we select m data points out of n where typically
$m \to \infty$ and $m/n \to 0$. If the selection is with replacement, the weights are an
appropriately rescaled random sample from Multinomial$(m; 1/n, \ldots, 1/n)$.
This scheme is usually called the m-out-of-n bootstrap. If the selection is
without replacement, the scheme can be identified with the delete-(n - m)
jackknife. For either situation, BW and CLTW hold.

See Praestgaard and Wellner (1993) for other variations and adaptations
of the classical bootstrap.

(b) The Bayesian bootstrap [Rubin (1981)] and its variations [see Zheng
and Tu (1988) and Lo (1991)] essentially use $w_n \sim$ Dirichlet$(\alpha, \ldots, \alpha)$. The
weighted likelihood bootstrap of Newton and Raftery (1994) is also a
variation, where $\phi_{ni}(\cdot)$ has a log-likelihood interpretation. The conditions BW,
CLTW and VW(a) are satisfied.

(c) The jackknives are specially geared towards estimation of bias and
variance. Suppose $\hat\theta_n$ is an estimator based on n observations and we wish
to estimate its variance.

In its simplest form, the delete-1 jackknife estimator is obtained as follows:
Drop the ith observation and recompute the estimator, say $\hat\theta_{n,i}$, on the
basis of the remaining n - 1 observations. Then the jackknife estimator of
the variance is $v = (n-1)n^{-1}\sum_{i=1}^{n}(\hat\theta_{n,i} - \hat\theta_n)^2$. To visualize the delete-1
jackknife as coming from a sequence of random weights, consider all vectors
$\eta_i$, $1 \le i \le n$, of length n where the ith coordinate of $\eta_i$ is zero and the rest are
1. Let $P(w_n = n(n-1)^{-1}\eta_i) = 1/n$ for $1 \le i \le n$. The above estimator is then
obtained after appropriate averaging over this uniform weight distribution.

The delete-d jackknife deletes d observations at a time and has a similar
interpretation. If $n - d \to \infty$, then BW holds. If $d/n \to c \in (0, 1)$, then CLTW
and VW(a) hold. If $d/n \to 0$, then VW(b) holds.

The downweight-d jackknife is a variation of the above. For $1 \le d \le n$
consider the n-dimensional vectors $\eta_{n:i_1,i_2,\ldots,i_d}$ where the jth coordinate of
$\eta_{n:i_1,\ldots,i_d}$ is d/n if j is one of $i_1,\ldots,i_d$, else it is (n + d)/n. The resampling
weights vector is a random sample from the set of these vectors. The asymptotic
properties of these weights are similar to the delete-d jackknives. However, since
no observation is assigned a weight zero, model assumptions like (3.16) are
not needed.
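
The weight schemes in (a)-(c) can be sketched as follows, assuming NumPy. The rescalings used here (n/m for the m-out-of-n bootstrap, n times a Dirichlet(1, ..., 1) draw for the Bayesian bootstrap, and n/(n - d) for the delete-d jackknife) are assumptions chosen so that each weight has mean one; the function names are hypothetical. The second function checks the delete-1 jackknife interpretation on the sample mean.

```python
import numpy as np

def gbs_weights(scheme, n, rng, m=None, d=None):
    """Draw one weight vector (w_1, ..., w_n) with E(w_i) = 1."""
    if scheme == "multinomial":            # classical bootstrap
        return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)
    if scheme == "m_out_of_n":             # with replacement, rescaled by n/m
        return (n / m) * rng.multinomial(m, np.full(n, 1.0 / n))
    if scheme == "bayesian":               # n * Dirichlet(1, ..., 1)
        return n * rng.dirichlet(np.ones(n))
    if scheme == "delete_d":               # keep n - d points, weight n/(n-d)
        w = np.zeros(n)
        keep = rng.choice(n, size=n - d, replace=False)
        w[keep] = n / (n - d)
        return w
    if scheme == "downweight_d":           # d/n on d points, (n+d)/n elsewhere
        w = np.full(n, (n + d) / n)
        down = rng.choice(n, size=d, replace=False)
        w[down] = d / n
        return w
    raise ValueError(f"unknown scheme: {scheme}")

def delete1_jackknife_variance(x):
    """Delete-1 jackknife variance of the sample mean via weights n/(n-1) * eta_i."""
    n = len(x)
    theta_n = x.mean()
    eta = np.ones((n, n)) - np.eye(n)      # row i has a zero in coordinate i
    w = n / (n - 1) * eta
    theta_ni = (w @ x) / w.sum(axis=1)     # weighted mean under each weight vector
    return (n - 1) / n * np.sum((theta_ni - theta_n) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=30)
v = delete1_jackknife_variance(x)
```

For the sample mean, `v` coincides with the usual estimate $s^2/n$, which is a convenient sanity check on the weight representation.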

2.3. Examples on implementation of GBS in some models. We consider 
three examples in this section. Important non-GBS techniques such as the 
RB and the wild bootstrap (WB) are also included for comparison. 

Example 2.1. Heteroscedastic time series: Consider the following model:
$X_t = \phi X_{t-1} + e_t$, $t = 1, \ldots, n$, where $X_0 = 0$, and $\{e_t\}$ is a sequence of
independent, normal, mean-zero random variables with $Ee_t^2 = \sigma_1^2$ if t is odd and
$Ee_t^2 = \sigma_2^2$ if t is even. Suppose that the unknown $\phi$ is estimated by the LSE
$\hat\phi = \sum X_t X_{t-1} / \sum X_{t-1}^2$. Let $V_n = E(\sqrt{n}(\hat\phi - \phi))^2$ be the quantity of interest
to be estimated using resampling techniques. In general $\phi$, $\sigma_1^2$ and $\sigma_2^2$ are
unknown. For simulation purposes we let $\phi = 0.2$, $\sigma_1^2 = 1$ and $\sigma_2^2 = 100$.

We study the RB, the wild bootstrap (WB) [Wu (1986) and Mammen (1992)],
GBS(1) with Multinomial$(n, 1/n, \ldots, 1/n)$ weights and GBS(2) with i.i.d.
Uniform(0.5, 1.5) weights. For simplicity, we use i.i.d. N(0, 1) weights for
the WB in all examples in this section. We used 10,000 simulations and
bootstrap sample size of 1000 on each of the four resampling techniques.

In Table 1 the first column indicates the sample size. The value of $V_n$
depends on n, but is approximately 0.11 for all three n values reported.
Table 1

Mean (and variance) of estimates of $V_n$ from heteroscedastic AR(1)
process (see Example 2.1) for residual bootstrap (RB), wild bootstrap
(WB), GBS with multinomial weights [GBS(1)], GBS with
Uniform(0.5, 1.5) weights [GBS(2)] over 10,000 simulation runs

 n    RB              WB              GBS(1)          GBS(2)
15    0.891 (0.013)   0.151 (0.008)   0.555 (0.466)   0.126 (0.007)
30    0.904 (0.006)   0.146 (0.007)   0.229 (0.021)   0.131 (0.005)
50    0.928 (0.004)   0.124 (0.002)   0.161 (0.005)   0.121 (0.002)

The first column denotes the sample size. Resample size is 1000.



The second column has the average over k of $V_k^{RB}$, the residual bootstrap
estimate of the variance of $\hat\phi$ for the kth simulation run. The variance of
$V_k^{RB}$ is given in parentheses. The figures in columns three to five have similar
interpretation for WB, GBS(1) and GBS(2).

From Table 1, it can be seen that RB, as expected, fails since it is not
adapted for heteroscedasticity. GBS(1) is better, but is erratic at low sample
sizes, a fact reflected in the high variance value of 0.47. WB does reasonably
well, but is consistently outperformed by GBS(2). However, for larger sample
sizes the difference between the latter three is nominal.
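
A single run of the GBS(2) computation in Example 2.1 can be sketched as below, assuming NumPy. The weighted estimating equation $\sum_t w_t X_{t-1}(X_t - \phi X_{t-1}) = 0$ has a closed-form root, so no iterative solver is needed. Scaling the resampling mean square by $n/\sigma_n^2$, with $\sigma_n^2 = 1/12$ for Uniform(0.5, 1.5) weights, is our reading of how the estimate is matched to $V_n$; the function names and the sample size used are illustrative.

```python
import numpy as np

def simulate_ar1(n, phi, s1, s2, rng):
    """X_t = phi X_{t-1} + e_t, X_0 = 0, Var(e_t) = s1 (t odd) or s2 (t even)."""
    t = np.arange(1, n + 1)
    sd = np.where(t % 2 == 1, np.sqrt(s1), np.sqrt(s2))
    e = rng.normal(size=n) * sd
    x = np.zeros(n + 1)
    for k in range(1, n + 1):
        x[k] = phi * x[k - 1] + e[k - 1]
    return x

def gbs_variance(x, n_boot, rng):
    """GBS estimate of V_n using i.i.d. Uniform(0.5, 1.5) weights."""
    lag, cur = x[:-1], x[1:]
    n = len(cur)
    phi_hat = np.sum(cur * lag) / np.sum(lag ** 2)
    w = rng.uniform(0.5, 1.5, size=(n_boot, n))
    # Each row solves the weighted estimating equation in closed form.
    phi_b = (w @ (cur * lag)) / (w @ (lag ** 2))
    sigma2 = 1.0 / 12.0                    # variance of Uniform(0.5, 1.5)
    v_gbs = n * np.mean((phi_b - phi_hat) ** 2) / sigma2
    return v_gbs, phi_hat

rng = np.random.default_rng(0)
x = simulate_ar1(200, 0.2, 1.0, 100.0, rng)
v_gbs, phi_hat = gbs_variance(x, 1000, rng)
```

Repeating this over many simulated series and averaging `v_gbs` corresponds to the kind of summary reported in the GBS(2) column of Table 1.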

Example 2.2. Generalized linear models: Suppose $\{Y_{ij}, j = 1, \ldots, N_i\}$
are independent Bernoulli$(p_i(\beta))$ random variables with $p_i(\beta) = [1 + \exp(t_i)]^{-1}\exp(t_i)$
and $t_i = \beta_0 + \beta_1 X_i$ for $i = 1, \ldots, n$.

We use $\{(N_i, X_i), i = 1, \ldots, n = 10\}$ from the data relating to effectiveness
of ethylene oxide as a fumigant [Myers, Montgomery and Vining (2002),
page 129]. Analysis of the actually observed $Y_i = \sum_{j=1}^{N_i} Y_{ij}$ values reported
in those data yields maximum likelihood estimates -17.90 for $\beta_0$ and 6.28
for $\beta_1$. We use these values as the true parameter values and simulate the
$Y_{ij}$'s according to the model described above, and obtain estimates $\hat\beta_0$ and
$\hat\beta_1$ of $\beta_0$ and $\beta_1$ by solving the likelihood equation.

We study the WB and two GBS techniques in this example. Let $N = \sum_{i=1}^{n} N_i$.
We use $(w_1, \ldots, w_N)$ from Multinomial$(N; 1/N, \ldots, 1/N)$ for GBS(1)
and $w_i$'s i.i.d. Exponential(1) for GBS(3). The WB is based on residuals,
for which there are several choices. In the present example, we define
$p_{ij} = (Y_{ij} + \delta)/(1 + 2\delta)$ as the observed proportion of success. The constant
$\delta = 0.001$ is used to avoid computational pathologies arising from the
observed proportions being 0 or 1. Let $t_{ij} = \log(p_{ij}/(1 - p_{ij}))$ be the "observed
logit," while $\hat t_i = \hat\beta_0 + \hat\beta_1 X_i$ is the "fitted logit." Define $r_{ij} = t_{ij} - \hat t_i$ as the
ijth residual, and $Y^*_{ij} = \hat t_i + U_{ij} r_{ij}$ where the $U_{ij}$'s are i.i.d. N(0, 1) random
variables.
Table 2

Observed logit, average confidence interval length and
coverage percentage from wild bootstrap (WB), GBS
with multinomial weights [GBS(1)] and GBS with
independent exponential weights [GBS(3)] for each of
the 10 data points from Example 2.2

Case   (Logit)     WB            GBS(1)        GBS(3)
 1     (2.264)     0.96 (0.1)    1.27 (96.4)   1.25 (96.4)
 2     (2.213)     0.94 (0.1)    1.25 (96.4)   1.23 (96.4)
 3     (1.791)     0.84 (0.1)    1.07 (96.3)   1.05 (96.4)
 4     (1.220)     0.70 (0.1)    0.85 (95.2)   0.83 (95.1)
 5     (1.099)     0.67 (0.1)    0.80 (95.2)   0.79 (95.0)
 6     (0.321)     0.61 (94.8)   0.66 (92.8)   0.66 (91.9)
 7     (-0.182)    0.69 (99.7)   0.75 (97.0)   0.74 (97.2)
 8     (-0.567)    0.76 (57.1)   0.83 (95.8)   0.83 (95.8)
 9     (-1.020)    0.85 (9.6)    0.97 (94.8)   0.96 (94.9)
10     (-2.956)    1.33 (0.3)    1.79 (95.2)   1.76 (94.7)

The nominal coverage is 95%. Resample size is 1000.



For each "true logit" $t_i = -17.90 + 6.28X_i$, $i = 1, \ldots, n$, we obtain
percentile based 95% confidence intervals using the three resampling schemes.
Resampling size taken is B = 1000. This exercise is repeated I = 1000 times,
and in Table 2 we report the average confidence interval length and coverage
percentage over these 1000 replications of the experiment. The WB performs
poorly in this example, since the residuals $r_{ij}$ depend on $\delta$ and carry little
information on the variability of the data. The GBS techniques perform
excellently in comparison.

Note that the likelihood is a function of the sufficient statistics $Y_i =
\sum_{j=1}^{N_i} Y_{ij}$, and sometimes only the $Y_i$'s, and not the individual $Y_{ij}$'s, are
available data. There we may use $(Y_i + \delta)/(N_i + 2\delta)$ as the "observed
proportion of success" associated with the ith covariate. This improves the
performance of the WB if the $N_i$'s are large, and if the $Y_i$'s are not close to zero or
$N_i$. However, in many problems $N_i > 1$ may not be an available option.
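
The GBS(3) computation of Example 2.2 amounts to re-solving the likelihood equation with one Exponential(1) weight per Bernoulli observation. The sketch below assumes NumPy and uses hypothetical simulated data, not the ethylene oxide data; the Newton-Raphson solver `weighted_logit_fit`, the covariate value `x0` and the number of resamples are illustrative choices.

```python
import numpy as np

def weighted_logit_fit(x, y, w, n_iter=50):
    """Newton-Raphson root of sum_j w_j (y_j - p_j(b)) (1, x_j)^T = 0."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ b)))
        score = X.T @ (w * (y - p))
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(info, score)
        b = b + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return b

# Hypothetical binary-response data, for illustration only.
rng = np.random.default_rng(0)
N = 400
x = rng.uniform(-2.0, 2.0, size=N)
true_b = np.array([-1.0, 1.0])
prob = 1.0 / (1.0 + np.exp(-(true_b[0] + true_b[1] * x)))
y = rng.binomial(1, prob).astype(float)

b_hat = weighted_logit_fit(x, y, np.ones(N))      # ordinary likelihood equation
x0 = 0.5                                          # covariate value of interest
logit_b = np.empty(200)
for r in range(200):
    w = rng.exponential(1.0, size=N)              # i.i.d. Exponential(1) weights
    b = weighted_logit_fit(x, y, w)
    logit_b[r] = b[0] + b[1] * x0
ci = np.percentile(logit_b, [2.5, 97.5])          # percentile interval for the logit
```

Because the exponential weights are strictly positive, every resample uses all observations, which avoids the degenerate fits that can occur when a resample drops all successes or all failures at some covariate value.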

Example 2.3. Nonlinear regression: We consider the isomerization data
from Huet, Bouvier, Gruet and Jolivet [(1996), page 11]. The reaction rate
of the catalytic isomerization of n-pentane to isopentane depends on
partial pressure at various stages. The model for the ith reaction rate $y_i$ is
$y_i = f(X_i, \theta) + e_i$, where $f(X_i, \theta) = \frac{\theta_1\theta_3(P_i - I_i/1.632)}{1 + \theta_2 H_i + \theta_3 P_i + \theta_4 I_i}$ and $X_i = (H_i, P_i, I_i)^T$
are the corresponding partial pressure values. The $e_i$'s are i.i.d. random
variables. The parameter $\theta = (\theta_1, \theta_2, \theta_3, \theta_4)^T$ is estimated by minimizing
$\Psi_n(\theta) = \sum_{i=1}^{n}(y_i - f(X_i, \theta))^2$ with the resulting estimate
$\hat\theta = (35.9193, 0.0708583, 0.0377385, 0.167166)^T$.
Fig. 1. Plots of GBS with multinomial weights [GBS(1)] (solid line) and residual
bootstrap (RB) (broken line) densities for the four parameters in Example 2.3. Plot i
corresponds to $\theta_i$, i = 1, 2, 3, 4. Resample size is 1000.



The analysis in Huet, Bouvier, Gruet and Jolivet (1996) includes an RB
using Studentized quantities for each $\theta_i$, and the resulting 95% equal-tail
confidence interval does not include zero for any of the $\theta_i$'s.

We study the RB, WB, GBS(1) and GBS(3) here. Note that $(w_1, \ldots, w_n) \sim$
Multinomial$(n; 1/n, \ldots, 1/n)$ for GBS(1) and the $w_i$'s are i.i.d. Exponential(1)
for GBS(3). Figure 1 represents the density histograms from RB and GBS(1)
overlaid on each other. Notice that for each $\theta_i$ the resampling densities have
two prominent modes, one near the estimate $\hat\theta$ and the other near $\theta^* =
(33.343956, -1.84281206, -1.0338937, -4.31406116)^T$. Note that $\Psi_n(\theta^*) = 3.26$,
a value quite close to $\Psi_n(\hat\theta) = 3.23$. The results from GBS(3) are similar to
those of GBS(1), while in WB the peak at $\theta^*$ is slightly less prominent.

The estimates $\hat\theta$ and $\theta^*$ represent two substantially different chemical
processes. This being real data, it is not known if $\hat\theta$ or $\theta^*$ is closer to $\theta$.
However, the presence of $\theta^*$ is not revealed in the analysis of Huet et al. The
bimodal curves in Figure 1 suggest that convex confidence intervals make a
bad summarization in the present problem. A less sensitive bootstrap such
as GBS may thus be useful in revealing features in data that theoretically
superior but sensitive resampling techniques may miss.

3. Main results. In Section 3.1 we assume p = 1 and obtain asymptotic
representations of $\hat\beta_n$ and $\hat\beta_B$ in Theorems 3.1 and 3.2. This establishes the
consistency of the GBS for estimating the distribution. In Section 3.3 we
consider general p, including the case where $p \to \infty$ as $n \to \infty$ and obtain
similar results in Theorems 3.4 and 3.5.

In Section 3.2 we focus on the variance estimation problem. We assume
that the $\phi_{ni}$, $1 \le i \le n$, are independent and p = 1. In Theorem 3.3 we
establish an asymptotic representation of the GBS variance estimator, thereby
generalizing part of the work of Liu and Singh (1992) and Hu (2001). All
proofs are only sketched and complete details are available from the authors.

We discuss specific model conditions in the respective sections. We
introduce some of the notation here: throughout, k and K, with or without
suffix, are used as generic constants. Two conventions are used: any
condition stated for a random function is assumed to hold almost surely unless
otherwise stated; and "for all $\beta$" always means for all $\beta$ in an open
neighborhood of $\beta_0$.

Write $\phi_{ni}(Z_{ni}, \beta) = \phi_{ni}(\beta) = (\phi_{ni(1)}(\beta), \ldots, \phi_{ni(p)}(\beta))^T$. Thus the ath
coordinate of $\phi_{ni}$ is $\phi_{ni(a)}$, $a = 1, \ldots, p$. Let $\phi_{0ni}(\beta) = \phi_{ni}(\beta)$ and for $k \ge 0$,
$\phi_{(k+1)ni(a)}(\beta) = \frac{\partial}{\partial\beta}\phi_{kni(a)}(\beta)$. Let $\phi_{kni(a)} = \phi_{kni(a)}(\beta_0)$. For each $\phi_{ni(a)}(\beta)$,
we assume that the following Taylor series expansion holds:

(3.1) $\phi_{ni(a)}(\beta + t) = \phi_{ni(a)}(\beta) + \phi_{1ni(a)}^T(\beta)\,t + 2^{-1} t^T H_{2ni(a)}(\beta_1)\,t$

for $\beta_1 = \beta + ct$ and for some $0 < c < 1$.

3.1. Asymptotics and bootstrap for p = 1. When p = 1, we simplify
notation by suppressing the last index and thus: $\phi_{ni}(Z_{ni}, \beta) = \phi_{ni(1)}(\beta)$, $\phi_{0ni}(\beta) =
\phi_{ni}(\beta)$, $\phi_{(k+1)ni}(\beta) = \frac{\partial}{\partial\beta}\phi_{kni}(\beta)$ and $\phi_{kni} = \phi_{kni}(\beta_0)$. We then write (3.1) as



10 S. CHATTERJEE AND A. BOSE 

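$\phi_{ni}(\beta + t) = \phi_{ni}(\beta) + \phi_{1ni}(\beta)\,t + 2^{-1}\phi_{2ni}(\beta_1)\,t^2$ for $\beta_1 = \beta + ct$ and for some
$0 < c < 1$.
Let

$\gamma_{1n} = a_n^{-2}\sum_{i=1}^{n} E\phi_{1ni}, \qquad S_{nj} = \sum_{i=1}^{j} \phi_{ni}.$

Assume that $a_n = \bigl[\sum_{i=1}^{n} E\phi_{ni}^2\bigr]^{1/2} \to \infty$.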

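Assumptions for Section 3.1. Assume that for every n, there is a
sequence of $\sigma$-fields $\mathcal{F}_{n1} \subset \cdots \subset \mathcal{F}_{nn}$, such that $\{S_{nj}, \mathcal{F}_{nj}, 1 \le j \le n\}$ is a
martingale. Further, with $\eta_n = \max(\sigma_n^2, 1)$,

(3.2) $E\phi_{ni} = 0$ for all $1 \le i \le n$, $n \ge 1$,

(3.3) $0 < k_2 < \gamma_{1n}$,

(3.4) $E\bigl[\sum_{i=1}^{n}(\phi_{1ni} - E\phi_{1ni})\bigr]^2 = o(a_n^4\eta_n^{-1}).$

There exist $\delta_0 > 0$ and $M_{2ni}$ such that

(3.5) $\sup_{|\beta - \beta_0| < \delta_0} |\phi_{2ni}(\beta)| < M_{2ni}$ and $E\bigl(\sum_{i=1}^{n} M_{2ni}\bigr)^2 = o(a_n^6\eta_n^{-1}).$

The triangular sequence $X_{ni} = \bigl(\sum_{i=1}^{n} E\phi_{ni}^2\bigr)^{-1/2}\phi_{ni}$ satisfies

(3.6) $\sum_{i=1}^{n} X_{ni}^2 \to_P 1, \qquad E\bigl(\max_i |X_{ni}|\bigr) \to 0.$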

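Theorem 3.1. Under (3.2)-(3.5) there exists a sequence $\{\hat\beta_n\}$ of
solutions of (1.1) such that

(3.7) $a_n(\hat\beta_n - \beta_0) = O_P(1),$

(3.8) $a_n\gamma_{1n}(\hat\beta_n - \beta_0) = -a_n^{-1}\sum_{i=1}^{n}\phi_{ni} + r_n,$

where $r_n = o_P(1)$. In addition, if (3.6) holds, then
$\bigl[\sum_{i=1}^{n} E\phi_{ni}^2\bigr]^{-1/2}\bigl[\sum_{i=1}^{n} E\phi_{1ni}\bigr](\hat\beta_n - \beta_0) \Rightarrow N(0, 1)$.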

Before we give the proof, we note that in general a sequence of solutions 
need not be measurable. See, for example, Ferguson (1996). However, there 
are enough assumptions in our model to guarantee this measurability. We 
omit these arguments here and also for the subsequent results. 
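Proof of Theorem 3.1. Fix any $\epsilon > 0$. By Chebyshev's inequality
and (3.2), there exists a $K > 0$ such that

(3.9) $\mathrm{Prob}\Bigl[a_n^{-1}\Bigl|\sum_{i=1}^{n}\phi_{ni}\Bigr| > K\Bigr] < \epsilon/2.$

Define $S_n(t) = a_n^{-1}\sum_{i=1}^{n}[\phi_{ni}(\beta_0 + a_n^{-1}t) - \phi_{ni}(\beta_0)] - \gamma_{1n}t$. Using a Taylor
series expansion of $\phi_{ni}(\cdot)$ about $\beta_0$ and (3.4)-(3.5), we can show that, given
any constant $C > 0$, for all large n,

(3.10) $E\Bigl[\sup_{|t| \le C}|S_n(t)|\Bigr] = o(1).$

Now note that

(3.11) $\inf_{|t|=C}\Bigl\{a_n^{-1}t\sum_{i=1}^{n}\phi_{ni}(\beta_0 + a_n^{-1}t)\Bigr\} \ge -C\sup_{|t|=C}|S_n(t)| + C^2\gamma_{1n} - Ca_n^{-1}\Bigl|\sum_{i=1}^{n}\phi_{ni}\Bigr|.$

From (3.9)-(3.11) we have, choosing C large enough,

$\mathrm{Prob}\Bigl[\inf_{|t|=C}\Bigl\{a_n^{-1}t\sum_{i=1}^{n}\phi_{ni}(\beta_0 + a_n^{-1}t)\Bigr\} > 0\Bigr]$

  $\ge \mathrm{Prob}\Bigl[a_n^{-1}\Bigl|\sum_{i=1}^{n}\phi_{ni}\Bigr| + \sup_{|t|=C}|S_n(t)| < C\gamma_{1n}\Bigr]$

  $= 1 - \mathrm{Prob}\Bigl[a_n^{-1}\Bigl|\sum_{i=1}^{n}\phi_{ni}\Bigr| + \sup_{|t|=C}|S_n(t)| \ge C\gamma_{1n}\Bigr]$

  $\ge 1 - \mathrm{Prob}\Bigl[a_n^{-1}\Bigl|\sum_{i=1}^{n}\phi_{ni}\Bigr| > Ck_2/4\Bigr] - \mathrm{Prob}\Bigl[\sup_{|t|=C}|S_n(t)| > Ck_2/4\Bigr]$

  $\ge 1 - \epsilon$ for all n sufficiently large.

By the continuity of $\sum_{i=1}^{n}\phi_{ni}(\beta)$ in $\beta$, this means that, for fixed $\epsilon > 0$, for
all n sufficiently large, there exists a C such that
$\sum_{i=1}^{n}\phi_{ni}(\beta_0 + a_n^{-1}t) = 0$ has a root $t = T_n$ in $|t| \le C$ with probability $> 1 - \epsilon$.

Defining $\hat\beta_n = \beta_0 + a_n^{-1}T_n$ when such $T_n$ exists and as an arbitrary zero of
$\sum_{i=1}^{n}\phi_{ni}(\beta) = 0$ otherwise, we get a solution to (1.1) which satisfies, for fixed
$\epsilon > 0$, $\mathrm{Prob}[a_n|\hat\beta_n - \beta_0| \le C] \ge 1 - \epsilon$ for all n large enough. This shows (3.7).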
Now with this C fixed, by arguments similar to those of (3.10), we obtain
that $a_n\gamma_{1n}(\hat\beta_n - \beta_0) = -a_n^{-1}\sum_{i=1}^{n}\phi_{ni} + r_n$, where $r_n = o_P(1)$.

The dependence of $\hat\beta_n$ on the choice of $\epsilon$ may be taken care of as described
in Serfling [(1980), page 148]. Briefly, this is as follows:

Since $\hat\beta_n = \hat\beta_{n,\epsilon}(\omega) \to_P \beta_0$ there is a subsequence along which the
convergence is with probability 1, and we may restrict attention to this
subsequence only. Thus in our definition of $\hat\beta_{n,\epsilon}(\omega)$ above, for every $\epsilon > 0$ there
is an $N_\epsilon$ such that for all $n > N_\epsilon$, $\omega$ belongs to a probability-1 set $\Omega_\epsilon$. Then
on the probability-1 set $\Omega_0 = \bigcap_{k \ge 1}\Omega_{1/k}$, without loss of generality we have
a nondecreasing sequence of integers $N_1(\omega) \le N_{1/2}(\omega) \le \cdots \le N_{1/k}(\omega) \cdots$.
For $n \in [N_{1/k}(\omega), N_{1/(k+1)}(\omega))$, we define $\hat\beta_n = \hat\beta_{n,1/k}(\omega)$ and let $\hat\beta_n = 0$
otherwise. Then the new sequence $\{\hat\beta_n\}$ has all the desired properties.

Further, assumption (3.6) ensures that $\sum_i X_{ni} = a_n^{-1}\sum_i \phi_{ni} \Rightarrow N(0, 1)$
by Theorem 5.4.2 of Borovskikh and Korolyuk (1997). □



Henceforth we work with that sequence of solutions $\{\hat\beta_n\}$ which satisfies
Theorem 3.1.

The bootstrap estimator is obtained by solving (1.2). The next theorem 
is on its asymptotic representation and consistency. Let 



F n {x) = P 
F Bn (x) = P B 



n l/2 



Ini 



L Li=l 



(Pn ~Po)<X 



1/2 



$>lni(A0 0B~Pn)< 



,i=l 



Theorem 3.2. Assume (3.2)-(3.5) and that the bootstrap weights satisfy
BW. Then there exists a sequence $\{\hat\beta_B\}$ of solutions of (1.2) such that



(3.12) a' 1 



^2<Plni{Pn) 



i=l 



1/2 



0B ~ Pn) = ~a n l E Wi<j) n i{Pn)<j)M + TnB, 



i=l 



where $P_B(|r_{nB}| > \epsilon) = o_P(1)$ for any $\epsilon > 0$. If in addition (3.6) and CLTW
hold, then

(3.13) $\sup_x |F_{Bn}(x) - F_n(x)| \to 0$ in probability.



Proof. The technique used in proving (3.12) is similar to the proof of
(3.7) and (3.8), and we omit some of the details here.

Define $\hat\gamma_{1n} = a_n^{-2}\sum_{i=1}^{n}\phi_{1ni}(\hat\beta_n)$ and $S_{nB}(t) = a_n^{-1}\sum_{i=1}^{n} w_i[\phi_{ni}(\hat\beta_n + a_n^{-1}t) -
\phi_{ni}(\hat\beta_n)] - \hat\gamma_{1n}t$. By arguments similar to those in the proof of Theorem 3.1
we have



inf < a n 1 t E Wi4>ni0n + a n H) [• > 
\t\ — C<j n I 



i=l 



> 1 -P 



B 



i=l 



> Cj ln a n /2 



-Pb 

1 - U w - U 2C 



inf \S nB (t)\ > C A / ln a n /2 

t\=Ca n 



say. 



For given $\epsilon > 0$ and $\delta > 0$, one can fix C large enough such that for all n
sufficiently large, we have $\mathrm{Prob}[U_{iC} > \epsilon/2] < \delta/2$, i = 1, 2.

Then with some algebra it can be established that $\sup_{|t| \le C\sigma_n}|S_{nB}(t)| =
\sigma_n r_{nB}$, where $P_B(|r_{nB}| > \epsilon) = o_P(1)$ for any $\epsilon > 0$; then it follows that
$\gamma_{1n}\sigma_n^{-1}a_n(\hat\beta_B - \hat\beta_n) = -a_n^{-1}\sum_{i=1}^{n} W_i\phi_{ni}(\hat\beta_n) + r_{nB}$. This completes the proof
of the first part. The second part follows from Theorem 3.1, the first part,
and Lemma 4.6 of Praestgaard and Wellner (1993). We omit the details. □



3.2. Asymptotics of the bootstrap variance estimator. The estimation of
the asymptotic variance of $\hat\beta_n$ is an important practical problem. In general,
distributional convergence and variance estimation are different problems.
For example, the delete-d jackknife ($d/n \to 0$) is not distributionally
consistent but is variance consistent for the i.i.d. sample mean. In this section
we establish consistency of the GBS variance estimator via an asymptotic
representation.

Assumptions for Section 3.2. We assume that the parameter is real
valued (p = 1), and that the $\{\phi_{ni}\}$ are independent. Also assume that

(3.14) $\phi_{ni}(\beta + t) = \phi_{ni}(\beta) + \phi_{1ni}(\beta)\,t + 2^{-1}\phi_{2ni}(\beta)\,t^2 + R_{ni}(t, \beta)\,t^2,$

where $|R_{ni}(t, \beta)| < k|t|^\alpha$ for each $\beta$ for some $0 < \alpha \le 1$.
Assume that with $L = 8(1 + \alpha)$:

(3.15) $\sum_{i=1}^{n} E|\phi_{ni}|^L + \sum_{i=1}^{n} E|\phi_{1ni}|^L + \sum_{i=1}^{n} E|\phi_{2ni}|^L = O(n).$

Suppose $m_0$ is a specified integer, related to assumption (2.5) on the
bootstrap weights. For any integer m in $\{m_0, \ldots, n\}$ consider the subset
$I_m = \{i_1, i_2, \ldots, i_m\}$ of $\{1, 2, \ldots, n\}$. We assume

(3.16) $m^{-1}\sum_{i \in I_m}\phi_{1ni}(\beta) > k_4 > 0$

for every such choice of subset $I_m$ of size m from $\{1, 2, \ldots, n\}$, for every m
in $[m_0, n]$ and $\beta$ satisfying $\|\beta - \beta_0\| < \delta$ for a $\delta > 0$.

Resampling schemes like the PB and the delete-d jackknives effectively
select subsets of the data in the resample, and the model assumption (3.16)
is required to hold on these subsets to make such resampling schemes
feasible. See Wu (1986) and its discussion for more details on this. Assumption
(3.16) helps in showing that under appropriate conditions the probability of
a "bad" subset selection by the bootstrap or jackknife mechanism is small;
see Proposition 3.1 (proof omitted). Some bootstrap clone methods and the
downweight-d jackknives do not require assumption (3.16). The
assumptions above are not the most general for consistency. However, the stronger
assumptions allow for more transparent computations.

Proposition 3.1. Assume the $\phi_{ni}$ are independent satisfying (3.14)
with (3.2)-(3.5), (3.15) and (3.16). Assume $\hat\beta_n$ is a solution to (1.1) from
Theorem 3.1. Let $\mathcal{A}$ be the set on which $m^{-1}\sum_{i \in I_m}\phi_{1ni}(\hat\beta_n) > k_4/2 > 0$ for
every such choice of subset $I_m$ of size m from $\{1, 2, \ldots, n\}$ and for every m
in $[m_0, n]$. Then $\mathrm{Prob}[\mathcal{A}] \ge 1 - O(n^{-2})$.

For this section we define our bootstrap estimator $\hat\beta_B$ to be the solution to
(1.2) on the set $\mathcal{A} \cap \mathcal{W}$, and $\hat\beta_n$ otherwise. This is to facilitate variance
computations, and the minor alteration in the definition is of negligible consequence
in the asymptotics. The set $\mathcal{A}$ is defined in Proposition 3.1, and $\mathcal{W}$ is defined
in Section 2. The GBS variance estimate is $V_{GBS} = \sigma_n^{-2}E_B(\hat\beta_B - \hat\beta_n)^2$. Note
that the asymptotic variance of $n^{1/2}g_{1n}(\hat\beta_n - \beta_0)$ is $v_n = n^{-1}\sum_{i=1}^{n} E\phi_{ni}^2$. In
the statement of the next theorem we have used $\phi$, $\phi_1$, $\phi_2$, respectively,
for $\phi_{ni}$, $\phi_{1ni}$, $\phi_{2ni}$. The sums range from 1 to n. Also let $g_{1n} = n^{-1}\sum E\phi_1$,
$g_{2n} = n^{-1}\sum E\phi_2$.

Theorem 3.3. Assume the $\phi_{ni}$ are independent satisfying (3.14) with
(3.2)-(3.5), (3.15) and (3.16). Assume $\hat\beta_n$ is a solution to (1.1) from
Theorem 3.1. Suppose the weights satisfy BW and either VW(a) or VW(b).



Then 



n 5ln( V GBS - V n ) 



-1 



E(<^ 2 -^ 2 ) 



2 



= n 



n 2 gin 



(3.17) 



2 



n 2 g\ n 



2 



E^E^E^+op^- 1 ). 



+ 



n 3 gl 
The terms on the right-hand side of (3.17) are $O_P(n^{-1/2})$, so Theorem
3.3 shows in particular that the resampling variance estimator
is consistent for the asymptotic variance of $n^{1/2}(\hat\beta_n - \beta_0)$.



Remark 1. The above asymptotic representation is actually that of the 
mean squared error. However, the bias is of a negligible order compared 
to the variance, and thus the same representation holds for the asymptotic 
variance. 



Remark 2. For the least squares estimator in linear regression, $\phi_1$ is
a constant and consequently $\phi_2$ is zero. There, using expansions for
resampling variances, Liu and Singh (1992) classified resampling techniques
in two groups: some are consistent even if errors are heteroscedastic, thus
they are "robust" (R-class); others work only under homoscedasticity but
have greater "efficiency" (E-class) than R-class techniques. Later, Bose and
Kushary (1996) and Hu (2001) showed that the above classification breaks
down if some other M-estimators are used.

Representation (3.17) is the same [up to $O_P(n^{-1})$ terms] as the R-class
representation obtained for the PB for LSE in Liu and Singh [(1992),
Theorem 2(ii)] and for general regression M-estimators in Hu [(2001), Theorem
2.2(ii)]. Note, however, representation (3.17) holds for a much broader class
of problems than regression M-estimators.

By computations similar to those in Hu (2001), it can be shown that
for particular choices of $\psi(\cdot)$ the GBS can be simultaneously robust against
heteroscedasticity of errors as well as more efficient than E-class techniques.



Proof of Theorem 3.3. We omit some of the details of the algebra 
involved in this proof. They are similar to those of Theorems 3.1 and 3.2. 

Let us concentrate on the set A n W only, since the contribution from the 
complement of this set is negligible. Define 



U nB (t) = a n l n 1/2 ^^[0„i(/3„ + a n n 1/2 t) - (j) ni n )} 



1/2, 



i=l 



)1 



'H^2wi(j) lni n ) - 2 1 a n n 3/2 t 2 ^ WifoniWn) ■ 



8=1 1=1 

Working along the lines of the proof of Theorem 3.2, we can show that 



Eb 



sup \U nB (t)\ 

\t\<Ca„ 



P (n 
Now, under $\mathcal{A} \cap \mathcal{W}$ we may plug in $t = \sigma_n^{-1}n^{1/2}(\hat\beta_B - \hat\beta_n)$ in $U_{nB}(t)$, and
after quite a lot of algebra we arrive at

-9\n(yn~ 1 n 1 l 2 (fi B - fin) 

= n" 1/2 J2 w i4>m - n~ 2 g^ J2 J2 W i^ni 

- n ~ 2 9\n ^{4>lni ~ E(j) lni ) ^ Wi4> ni 
+ n" 5/2 5l - 2 4>ni ^ni J2 W ^ni 
+ (TnU-^g^ J2 W i<Pni J2 W i4>lni 

+ 2~ 1 a n n- 3/2 g2ngin (J2 Wi<p n i) 2 + R nB 

$= C_n + T_{1n} + T_{2n} + T_{3n} + T_{4n} + T_{5n} + R_{nB}$, say,

where $E_B R_{nB}^2 = O_P(n^{-(1+\alpha)})$.

Now it can be easily checked that $E_B C_n^2 = O_P(1)$, and $E_B T_{in}^2 = O_P(n^{-1})$
for $i = 1, \ldots, 5$. In the cross product, by direct computation $E_B C_n T_{in} = O_P(n^{-1})$
for $i = 4, 5$, and hence $n g_{1n}^2 V_{GBS} = E_B C_n^2 + 2 E_B C_n (T_{1n} + T_{2n} + T_{3n}) + O_P(n^{-1})$.
The rest of the proof follows by calculating the above moments. □



3.3. Dimension asymptotics. In this section we generalize the results of
Section 3.1 to dimensions greater than 1 and also allow dimension $p = p_n \to \infty$
as the data size $n \to \infty$. Dimension asymptotics has been a major aspect
of the study of resampling in the framework of linear regression [Bickel
and Freedman (1983) and Mammen (1989, 1993)]. The classical residual-based
bootstrap has been studied for the LSE [Bickel and Freedman (1983)]
and for general M-estimators [Mammen (1989)] using nonrandom design 
matrices. The random design case and resampling using PB and WB have 
been studied in Mammen (1993). This section is an attempt to explore the 
high-dimensionality aspect in more general problems. 



Assumptions for Section 3.3. The following notation will be used: $\|c\|$
is the Euclidean norm of a vector $c$, $A^T$ is the transpose of the matrix $A$,
$\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ are, respectively, the maximum and minimum
eigenvalue of $A$.

Assume that 



sup 



>>(< 



1/2 



OO 



as n 



oo. 
Let



Snj = E and Tin = a n 2 E "^1™ ■ 

i=l i=l 

Assume that for every $n$ there is a sequence of $\sigma$-fields $\mathcal{F}_{n1} \subset \cdots \subset \mathcal{F}_{nn}$,
such that $\{S_{nj}, \mathcal{F}_{nj}, j = 1, \ldots, n\}$ is a martingale sequence. Recall that ?? n =
max (cr^l). 
Assume that 



(3.18) 
(3.19) 



E(f>ni = 0, 

E E E W^ni{a) ~ E 01ni(a) f = °( a n ? ?n 1 )- 



n V 



i=l a=l 



For the symmetric matrix $H_{2ni(a)}$ in (3.1), for some $\delta_0 > 0$ there exists a
symmetric matrix $M_{2ni(a)}$ such that



(3.20) 
(3.21) 



$$\sup_{\{t : \|t\| < \delta_0\}} H_{2ni(a)}(\beta_0 + t) < M_{2ni(a)},$$



n p 



i=l a=l 



Let $\phi_{1ni}(\beta)$ be the $(p \times p)$ matrix whose $a$th row is given by $\phi_{1ni(a)}(\beta)$,
for $a = 1, \ldots, p$. Let $\Gamma_{1n}(\beta) = a_n^{-2} \sum_{i=1}^n \phi_{1ni}(\beta)$. Let $G_{1n} = a_n^{-2} \sum_{i=1}^n E\phi_{1ni}$.
Assume



(3.22)  $0 < k_2 < \lambda_{\min}(G_{1n}).$



Let $\{c = c_n \in \mathbb{R}^{p_n} = \mathbb{R}^p, \|c\| = 1\}$ be a fixed sequence of vectors on the
unit balls of $p = p_n$-dimensional Euclidean spaces. Let



2 -2 

s n =P 



-2 -2 
S n =P 



lni I C 



.i=l 



T r 



n4>ni 



Li=l 
T 



-1 



E^WAi)) C 
,i=l / . 

/ n \ -1 



E^ 

L 

n 

E^i0n)0ni(A 



E ^lni J c 
i=l / 



i=l 



E^i™(^«) c 



,i=i 



Then $X_{ni}$ is measurable with respect to $\mathcal{F}_{ni}$ and satisfies



(3.23) 
Theorem 3.4. Under (3.18)-(3.22) there exists a sequence $\{\hat\beta_n\}$ of
solutions of (1.1) such that if $p a_n^{-2} \to 0$, then

(3.24)  $a_n p^{-1/2} \|\hat\beta_n - \beta_0\| = O_P(1),$

(3.25)  $a_n p^{-1/2} G_{1n} (\hat\beta_n - \beta_0) = -a_n^{-1} p^{-1/2} \sum_{i=1}^n \phi_{ni} + r_n,$

where $\|r_n\| = o_P(1)$. In addition, if (3.23) is satisfied, then
$s_n^{-1} c^T (\hat\beta_n - \beta_0) \xrightarrow{\mathcal{D}} N(0,1)$.



Remark 3. The conditions (3.18)-(3.22) are nearly the same as condi- 
tions (C.1)-(C.3) of Lahiri (1992) except that he requires finite third mo- 
ments for deriving Edgeworth expansions. It may also be noted that in most 
applications the Mini's are uniformly almost surely bounded away from zero. 
Thus condition (3.22) [and (3.3)] is easily satisfied. 

Proof of Theorem 3.4. We first establish that given any $\epsilon > 0$, there exists $K > 0$
such that $\mathrm{Prob}[\|a_n^{-1} p^{-1/2} \sum_{i=1}^n \phi_{ni}\| > K] < \epsilon/2$, using Chebyshev's
inequality, (3.18) and that $\sum_{i=1}^n E(\|\phi_{ni}\|^2) = O(a_n^2 p)$. Let

$$S_n(t) = a_n^{-1} p^{-1/2} \sum_{i=1}^n [\phi_{ni}(\beta_0 + a_n^{-1} p^{1/2} t) - \phi_{ni}(\beta_0)] - G_{1n} t,$$

$$M_{1n} = \sum_{i,j=1}^n \sum_{a=1}^p (\phi_{1ni(a)} - E\phi_{1ni(a)}) (\phi_{1nj(a)} - E\phi_{1nj(a)})^T.$$

Since $p/a_n^2 \to 0$, for every fixed $t$ eventually $\beta_0 + a_n^{-1} p^{1/2} t$ lies in the set
$\{\beta_0 + x : \|x\| < \delta_0\}$, and using (3.1) we have that

$$\Big[\sup_{\|t\| \le C} \|S_n(t)\|\Big]^2 \le 2 a_n^{-4} C^2 \lambda_{\max}(M_{1n}) + 2^{-1} a_n^{-6} p C^4 \sum_{i,j=1}^n \sum_{a=1}^p \lambda_{\max}(M_{2ni(a)}) \lambda_{\max}(M_{2nj(a)}).$$

Since $M_{1n}$ is nonnegative definite, its maximum eigenvalue can be dominated
by its trace; from (3.19)-(3.21) it follows that $E[\sup_{\|t\| \le C} \|S_n(t)\|]^2 = o(1)$.
Note that

$$\inf_{\|t\| = C} \Big\{ a_n^{-1} p^{-1/2} t^T \sum_{i=1}^n \phi_{ni}(\beta_0 + a_n^{-1} p^{1/2} t) \Big\} \ge -C \sup_{\|t\| = C} \|S_n(t)\| + C^2 l_{1n} - C a_n^{-1} p^{-1/2} \Big\| \sum_{i=1}^n \phi_{ni} \Big\|,$$

where $l_{1n} = \lambda_{\min}(2^{-1}(G_{1n} + G_{1n}^T))$, which from (3.22) is positive. Then by
choosing $C$ large enough, we have that



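$$\mathrm{Prob}\Big[ \inf_{\|t\| = C} \Big\{ a_n^{-1} p^{-1/2} t^T \sum_{i=1}^n \phi_{ni}(\beta_0 + a_n^{-1} p^{1/2} t) \Big\} > 0 \Big]$$
$$\ge 1 - \mathrm{Prob}\Big[ a_n^{-1} p^{-1/2} \Big\| \sum_{i=1}^n \phi_{ni} \Big\| > C k_2/2 \Big] - \mathrm{Prob}\Big[ \sup_{\|t\| = C} \|S_n(t)\| > C k_2/2 \Big]$$
$$\ge 1 - \epsilon$$

for all $n$ sufficiently large.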



On the set where $\inf_{\|t\| = C} t^T \sum_{i=1}^n \phi_{ni}(\beta_0 + a_n^{-1} p^{1/2} t) > 0$, it then follows
that $\sum_{i=1}^n \phi_{ni}(\beta_0 + a_n^{-1} p^{1/2} t) = 0$ for some $t \in \{t : \|t\| \le C\}$, from continuity
of the $\phi_{ni}$'s and using Theorem 6.3.4 of Ortega and Rheinboldt (1970). Now
(3.24) and (3.25) follow with a little algebra. The asymptotic normality is
proved as in Theorem 3.1. □
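The existence step of the proof rests on the following fact: if a continuous map $f$ satisfies $t^T f(t) > 0$ on the sphere $\|t\| = C$, then $f$ has a zero inside the ball. The sketch below checks this numerically on a toy two-dimensional map. The matrix `A` (standing in for $G_{1n}$), the vector `b` (standing in for the scaled sum of the $\phi_{ni}$ at $\beta_0$), the small sine perturbation (standing in for $S_n(t)$) and the Newton solver are all illustrative assumptions, not objects from the paper.

```python
import math

A = [[2.0, 0.5], [-0.3, 1.5]]   # toy stand-in for G_{1n} (assumption)
b = [1.0, -0.5]                 # toy stand-in for the scaled score (assumption)

def f(t):
    """f(t) = A t + small smooth perturbation - b."""
    return [A[0][0] * t[0] + A[0][1] * t[1] + 0.2 * math.sin(t[1]) - b[0],
            A[1][0] * t[0] + A[1][1] * t[1] + 0.2 * math.sin(t[0]) - b[1]]

def jacobian(t):
    """Derivative matrix of f at t."""
    return [[A[0][0], A[0][1] + 0.2 * math.cos(t[1])],
            [A[1][0] + 0.2 * math.cos(t[0]), A[1][1]]]

def newton(t, steps=30):
    """Plain Newton iteration for f(t) = 0 in two dimensions."""
    for _ in range(steps):
        (a11, a12), (a21, a22) = jacobian(t)
        f1, f2 = f(t)
        det = a11 * a22 - a12 * a21
        # Solve J d = f by Cramer's rule, then step t <- t - d.
        d1 = (f1 * a22 - a12 * f2) / det
        d2 = (a11 * f2 - a21 * f1) / det
        t = [t[0] - d1, t[1] - d2]
    return t

C = 2.0
# Boundary condition t^T f(t) > 0 on the circle ||t|| = C, checked on a grid.
angles = [2.0 * math.pi * k / 720 for k in range(720)]
boundary_min = min(
    sum(ti * fi for ti, fi in zip(t, f(t)))
    for t in ([C * math.cos(a), C * math.sin(a)] for a in angles)
)
root = newton([0.0, 0.0])
print(boundary_min > 0, math.hypot(*root) < C)  # prints: True True
```

Here the boundary condition holds because the symmetric part of `A` has smallest eigenvalue about 1.48, which dominates the perturbation and `b` once $C = 2$; this mirrors the role of $l_{1n}$ and the choice of a large $C$ in the proof.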

Let $F_n(x) = \mathrm{Prob}[s_n^{-1} c^T (\hat\beta_n - \beta_0) \le x]$ and let $\Phi(\cdot)$ be the standard
normal distribution function. Our model conditions are sufficient to argue that
$\det(\sum_{i=1}^n \phi_{1ni}(\hat\beta_n)) = 0$ has asymptotically negligible probability. In practice, this
case is extremely unlikely. Hence define

$$F_{Bn}(x) = P_B[\hat s_n^{-1} \sigma_n^{-1} c^T (\hat\beta_B - \hat\beta_n) \le x] \, I_{\{\det(\sum_{i=1}^n \phi_{1ni}(\hat\beta_n)) \ne 0\}} + \Phi(x) \, I_{\{\det(\sum_{i=1}^n \phi_{1ni}(\hat\beta_n)) = 0\}}$$

as the bootstrap distribution function estimator.
The next theorem is an analog of Theorem 3.2.

Theorem 3.5. Assume the conditions (3.18)-(3.22) and the bootstrap
weights satisfy BW. There exists a sequence $\{\hat\beta_B\}$ of solutions of (1.2) such
that if $p/a_n^2 \to 0$,

(3.26)  $\sigma_n^{-1} a_n^{-1} p^{-1/2} \sum_{i=1}^n \phi_{1ni}(\hat\beta_n) (\hat\beta_B - \hat\beta_n) = -a_n^{-1} p^{-1/2} \sum_{i=1}^n W_i \phi_{ni}(\hat\beta_n) + r_{nB1},$

where $\|r_{nB1}\| = o_P(1)$. In addition, if (3.23) holds and BW and CLTW hold,
then

(3.27)  $\sup_x |F_{Bn}(x) - F_n(x)| \to 0$ in probability.



A sketch of the proof of this theorem is given after Remark 4. 
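To make the bootstrap distribution function estimator concrete, here is a minimal sketch for the scalar estimating function $\phi(z, \beta) = z - \beta$, whose solution is the sample mean. It is an illustration under stated assumptions rather than the paper's procedure: the weights are taken to be i.i.d. Exp(1) (mean 1, standard deviation $\sigma_n = 1$, a Bayesian-bootstrap-type choice assumed here to be admissible), the scale `s_hat` is the plug-in scale for this scalar case, and the Monte Carlo version of the bootstrap distribution function is compared with the normal limit $\Phi$ rather than with $F_n$ itself.

```python
import math
import random

def std_normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = random.Random(1)
n, B = 200, 2000                                   # illustrative sizes (assumption)
z = [rng.gauss(1.0, 2.0) for _ in range(n)]        # simulated data (assumption)

# phi(z, beta) = z - beta, so the unweighted solution is the sample mean.
beta_hat = sum(z) / n
# Plug-in scale: sqrt(sum phi^2) / |sum phi_1| with phi_1 = -1.
s_hat = math.sqrt(sum((zi - beta_hat) ** 2 for zi in z)) / n

sigma_w = 1.0                                      # sd of Exp(1) weights
stats = []
for _ in range(B):
    w = [rng.expovariate(1.0) for _ in range(n)]   # i.i.d. weights with mean 1
    # beta_b solves sum_i w_i * (z_i - beta) = 0.
    beta_b = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    stats.append((beta_b - beta_hat) / (s_hat * sigma_w))
stats.sort()

# Largest gap between the Monte Carlo bootstrap CDF and the normal CDF.
sup_gap = max(
    max(abs((k + 1) / B - std_normal_cdf(s)), abs(k / B - std_normal_cdf(s)))
    for k, s in enumerate(stats)
)
print(round(sup_gap, 3))
```

A small value of `sup_gap` is what first-order consistency suggests in this simple setting; the gap reflects both Monte Carlo error from the finite number of resamples `B` and the finite-sample departure from normality.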
Remark 4. Lahiri (1992) has shown the consistency (and second-order
accuracy) of an appropriate residual bootstrap for the usual M-estimation
model with i.i.d. errors, known design and fixed $p$. Theorem 3.5 implies only
the first-order consistency of the GBS and hence in particular of the PB,
but for a much larger class of models.

In general GBS is not second-order accurate. First, $\hat\beta_n$ and $\hat\beta_B$ are biased
for $\beta_0$ and $\hat\beta_n$, respectively, and the biases are not negligible in the second
order. Further, as is known from the extensive literature on resampling,
without an appropriate Studentization no resampling plan can hope to be
second-order accurate. With appropriate bias correction and Studentization,
the GBS can be made to be second-order accurate.

Define 

9l = n ~ 1 ^2 ( t )2 ni0n) and <f nB = n~ 1 ^Wf(f) 2 ni ((3 n ). 

The following turns out to be the appropriate bias corrected Studentized 
statistic: 



Tn=Jln9n 1 [n 1/2 0n-Po)} 



1 -1/2 --2 



7ln5n 1 72nV G BS, 



T nB = llng nB [ a n L n L/ \(3 B ~ $n)] 

+ 2- l n~ l l 2 a n g~ l B ^ 2n [a n ~ l n l l 2 B - /3 n )] 2 . 

Chatterjee (1999) has shown that $T_{nB}$ is second-order accurate for $T_n$.
However, there is ample scope for improvement on the conditions assumed
there.



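Proof of Theorem 3.5. Let us concentrate on the set $\{\|\hat\beta_n - \beta_0\| < \delta_0/2\}$,
since the complement of this set can be shown to have negligible
contribution.

There we have

(3.28)
$$P_B\Big[ \sigma_n^{-1} p^{-1/2} a_n^{-1} \Big\| \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n) \Big\| > K \Big] \le k K^{-2} p^{-1} a_n^{-2} \Big[ \sum_{i=1}^n \|\phi_{ni}\|^2 + \|\hat\beta_n - \beta_0\|^2 \sum_{i=1}^n \sum_{a=1}^p \|\phi_{1nia}\|^2 + \|\hat\beta_n - \beta_0\|^4 \sum_{i=1}^n \sum_{a=1}^p \lambda_{\max}^2(M_{2nia}) \Big] = K^{-2} O_P(1).$$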



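Thus for fixed $\delta_1, \delta_2 > 0$, by choosing $K$ large enough we have

(3.29)  $\mathrm{Prob}\Big[ P_B\Big[ \sigma_n^{-1} p^{-1/2} a_n^{-1} \Big\| \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n) \Big\| > K \Big] > \delta_1 \Big] < \delta_2.$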
Let

$$S_{nB}(t) = \sigma_n^{-1} p^{-1/2} a_n^{-1} \sum_{i=1}^n w_i [\phi_{ni}(\hat\beta_n + \sigma_n p^{1/2} a_n^{-1} t) - \phi_{ni}(\hat\beta_n)] - \Gamma_{1n}(\hat\beta_n) t.$$

On the set $\{\|t\| \le C\} \cap \{\|\hat\beta_n - \beta_0\| < \delta_0/2\}$, we have for large $n$

/ P n \ 

\\SnB(t)\\ 2 <2a 2 n a- i C 2 X m J ^ £ tWj0i«ia(&)0inja(&) T 

\a=lij=l / 
P / n \ 2 

a=l\j=l / 

= T 1 +T 2 say. 

With some algebra it can be shown that $\sum_{j=1}^2 P_B[T_j > K] = o_P(1)$, thus

(3.30)  $P_B\Big[ \sup_{\|t\| \le C} \|S_{nB}(t)\| > 2K \Big] \le \sum_{j=1}^2 P_B[T_j > K] + O_P(a_n^{-1} p^{1/2}) = o_P(1).$

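Now observe that

$$\inf_{\|t\| = C} \Big\{ \sigma_n^{-1} a_n^{-1} p^{-1/2} t^T \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n + \sigma_n a_n^{-1} p^{1/2} t) \Big\} \ge -C \sup_{\|t\| = C} \|S_{nB}(t)\| + C^2 l_{1n} - C \sigma_n^{-1} a_n^{-1} p^{-1/2} \Big\| \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n) \Big\|,$$

where $l_{1n} = \lambda_{\min}(2^{-1}(\Gamma_{1n}(\hat\beta_n) + \Gamma_{1n}^T(\hat\beta_n)))$. Notice that $l_{1n} > k_2/2$ with
probability $1 - o(1)$, for the constant $k_2$ from (3.22). By choosing $C$ large enough,
from (3.28), (3.29) and (3.30) we have that on the set $\{\|\hat\beta_n - \beta_0\| < \delta_0/2\}$,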



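$$P_B\Big[ \inf_{\|t\| = C} \Big\{ \sigma_n^{-1} a_n^{-1} p^{-1/2} t^T \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n + \sigma_n a_n^{-1} p^{1/2} t) \Big\} > 0 \Big]$$
$$\ge 1 - P_B\Big[ \sigma_n^{-1} a_n^{-1} p^{-1/2} \Big\| \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n) \Big\| > C k_2/2 \Big] - P_B\Big[ \sup_{\|t\| = C} \|S_{nB}(t)\| > C k_2/2 \Big].$$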

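Thus for fixed $\delta_1, \delta_2 > 0$, we have that for $C$ large enough for all large $n$,

$$\mathrm{Prob}\Big[ P_B\Big[ \inf_{\|t\| = C} \Big\{ \sigma_n^{-1} a_n^{-1} p^{-1/2} t^T \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n + \sigma_n a_n^{-1} p^{1/2} t) \Big\} > 0 \Big] < 1 - \delta_1 \Big]$$
$$\le \mathrm{Prob}\Big[ P_B\Big[ \sigma_n^{-1} a_n^{-1} p^{-1/2} \Big\| \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n) \Big\| > C k_2/2 \Big] > \delta_1/2 \Big]$$
$$+ \mathrm{Prob}\Big[ P_B\Big[ \sup_{\|t\| = C} \|S_{nB}(t)\| > C k_2/2 \Big] > \delta_1/2 \Big] + O(a_n^{-1} p^{1/2})$$
$$< \delta_2.$$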



On the set $\inf_{\|t\| = C} \{ \sigma_n^{-1} a_n^{-1} p^{-1/2} t^T \sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n + \sigma_n a_n^{-1} p^{1/2} t) \} > 0$,
using the continuity of $\sum_{i=1}^n w_i \phi_{ni}(\cdot)$ and Theorem 6.3.4 of Ortega and
Rheinboldt (1970), we have that $\sum_{i=1}^n w_i \phi_{ni}(\hat\beta_n + \sigma_n a_n^{-1} p^{1/2} t) = 0$ has a root $T_n$
in $\|t\| < C$. Putting $\hat\beta_B = \hat\beta_n + \sigma_n a_n^{-1} p^{1/2} T_n$, we get a solution to (1.2) which
satisfies, for fixed $\epsilon, \delta > 0$,

$$\mathrm{Prob}\big[ P_B[\sigma_n^{-1} a_n p^{-1/2} \|\hat\beta_B - \hat\beta_n\| < C] < 1 - \epsilon \big] < \delta$$

for all $n$ large enough. Now notice that with this $C$ fixed, we have actually
shown that with $t = T_n$

$$\sigma_n^{-1} a_n p^{-1/2} \Gamma_{1n}(\hat\beta_n) (\hat\beta_B - \hat\beta_n) = -a_n^{-1} p^{-1/2} \sum_{i=1}^n W_i \phi_{ni}(\hat\beta_n) + r_{nB1},$$

where $\|r_{nB1}\| = o_P(1)$. This shows (3.26). □